Combining Signals for Enhanced Alpha

Features and Labels

Criteria Meet Specification

What do you observe in the rolling autocorrelation of labels shifted?

Describe the relationship between the shifted labels.

Train/Valid/Test Splits

Correctly implement the train_valid_test_split function.

Random Forests

Criteria Meet Specification

Why does dispersion_20d have the highest feature importance, when the first split is on the Momentum_1YR feature?

Describe why dispersion_20d has the highest feature importance, when the first split is on the Momentum_1YR feature.

1) What do you observe with the accuracy vs tree size graph? 2) Does the graph indicate the model is overfitting or underfitting? Describe how it indicates this.

Describe how the accuracy changes over time and what indicates the model is overfitting or underfitting.

Overlapping Samples

Criteria Meet Specification

Drop Overlapping Samples

Correctly implement the non_overlapping_samples function.

Use BaggingClassifier's max_samples

Correctly implement the bagging_classifier function.

Build an ensemble of non-overlapping trees: OOB Score

Correctly implement the calculate_oob_score function.

Build an ensemble of non-overlapping trees: Non Overlapping Estimators

Correctly implement the non_overlapping_estimators function.